DynamiTE: Parallel Materialization of Dynamic RDF Data

Authors

  • Jacopo Urbani
  • Alessandro Margara
  • Ceriel J. H. Jacobs
  • Frank van Harmelen
  • Henri E. Bal
Abstract

One of the main advantages of using semantically annotated data is that machines can reason on it, deriving implicit knowledge from explicit information. In this context, materializing every possible implicit derivation from a given input can be computationally expensive, especially when considering large data volumes. Most solutions that address this problem rely on the assumption that the information is static, i.e., that it does not change or changes very infrequently. However, the Web is extremely dynamic: online newspapers, blogs, social networks, etc., are frequently changed so that outdated information is removed and replaced with fresh data. This calls for a materialization that is not only scalable, but also reactive to changes. In this paper, we consider the problem of incremental materialization, that is, how to update the materialized derivations when data is added or removed. To this end, we consider the ρdf RDFS fragment [12], and present a parallel system that implements a number of algorithms to quickly recalculate the derivation. When new data is added, our system uses a parallel version of the well-known semi-naive evaluation of Datalog. For removals, we have implemented two algorithms: one based on previous theoretical work, and another that is more efficient since it does not require a complete scan of the input. We have evaluated the performance using a prototype system called DynamiTE, which organizes the knowledge bases with a number of indices to facilitate the query process and exploits parallelism to improve performance. The results show that our methods are indeed capable of recalculating the derivation in a short time, opening the door to reasoning on much more dynamic data than is currently possible.
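To make the addition case concrete, the following is a minimal, sequential sketch of semi-naive evaluation over two RDFS-style rules (subclass transitivity and type inheritance). It is not the DynamiTE implementation: the function names, the rule selection, and the in-memory set representation are assumptions made for illustration, and it leaves out the parallelism and indices that the abstract describes. The point it shows is that each round joins only the newly derived triples (the delta) against the full set, rather than re-joining everything.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]

# Illustrative predicate names (assumed for this sketch).
SC = "rdfs:subClassOf"
TYPE = "rdf:type"


def apply_rules(delta: Set[Triple], full: Set[Triple]) -> Set[Triple]:
    """Derive triples using at least one premise taken from `delta`.

    Rules (hypothetical selection for this sketch):
      (a sc b), (b sc c)   -> (a sc c)
      (x type a), (a sc b) -> (x type b)
    """
    out: Set[Triple] = set()
    for (s, p, o) in delta:
        for (s2, p2, o2) in full:
            if p == SC and p2 == SC and s2 == o:
                out.add((s, SC, o2))      # delta triple is the left premise
            if p == SC and p2 == SC and o2 == s:
                out.add((s2, SC, o))      # delta triple is the right premise
            if p == SC and p2 == TYPE and o2 == s:
                out.add((s2, TYPE, o))    # new subclass edge lifts old types
            if p == TYPE and p2 == SC and s2 == o:
                out.add((s, TYPE, o2))    # new type triple climbs the hierarchy
    return out


def semi_naive_add(materialized: Set[Triple], added: Iterable[Triple]) -> Set[Triple]:
    """Return the closure of `materialized` plus `added`.

    `materialized` is assumed to be already closed under the rules.
    """
    full = set(materialized)
    delta = set(added) - full
    while delta:
        full |= delta
        # Only triples not seen before feed the next round.
        delta = apply_rules(delta, full) - full
    return full


schema = {("Cat", SC, "Mammal"), ("Mammal", SC, "Animal")}
closure = semi_naive_add(set(), schema)
updated = semi_naive_add(closure, {("tom", TYPE, "Cat")})
```

In this sketch the second call does not recompute the schema closure; it only propagates the single added triple. A real system would replace the inner scan over `full` with index lookups.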

Similar resources

Optimizing RDF Stores by Coupling General-purpose Graphics Processing Units and Central Processing Units

From our experience in using RDF stores as a backend for social media streams, we pinpoint three shortcomings of current RDF stores in terms of aggregation speed, constraint checking and large-scale reasoning. Parallel algorithms are being proposed to scale reasoning on RDF graphs. However, the current efforts focus on the closure computation using High Performance Computing (HPC) and require pr...

Parallel Sort-merge-join Reasoning

We present an in-memory, cross-platform, parallel reasoner for RDFS and RDFSPlus. Inferray uses carefully optimized hash-based join and sorting algorithms to perform parallel materialization. Designed to take advantage of the architecture of modern CPUs, Inferray exhibits very good use of cache and memory bandwidth. It offers state-of-the-art performance on RDFS materialization, outperforms ...

QSMat: Query-Based Materialization for Efficient RDF Stream Processing

This paper presents a novel approach, QSMat, for efficient RDF data stream querying with flexible query-based materialization. Previous work accelerates either the maintenance of a stream window materialization or the evaluation of a query over the stream. QSMat exploits knowledge of a given query and entailment rule-set to accelerate window materialization by avoiding inferences that provably ...

Incremental Reasoning on Streams and Rich Background Knowledge

This article presents a technique for Stream Reasoning, consisting in incremental maintenance of materializations of ontological entailments in the presence of streaming information. Previous work, delivered in the context of deductive databases, describes the use of logic programming for the incremental maintenance of such entailments. Our contribution is a new technique that exploits the natu...

Reasoning in RDFS is Inherently Serial, At Least in The Worst Case

There have recently been several papers presenting scalable distributed inference systems for the W3C Resource Description Framework Schema language (RDFS). These papers have made claims to the effect that they can produce the RDFS closure for billions of RDF triples using parallel hardware. For example, Urbani et al. claim “a distributed technique to perform materialization under the RDFS [. . ...

Publication date: 2013